IAES International Journal of Artificial Intelligence (IJ-AI) 
Vol. 8, No. 3, September 2019, pp. 244~251 
ISSN: 2252-8938, DOI: 10.11591/ijai.v8.i3.pp244-251


Expert judgment Z-Numbers as a ranking indicator for 
hierarchical fuzzy logic system 


Shaiful Bakhtiar bin Rodzman1, Normaly Kamal Ismail2, Nurazzah Abd Rahman3, Syed Ahmad
Aljunid4, Zulhilmi Mohamed Nor5, Ku Muhammad Naim Ku Khalif6
1,2,3,4Faculty of Computer & Mathematical Sciences, Universiti Teknologi MARA, Shah Alam, Malaysia
5Fakulti Pengajian Quran dan Sunnah, Universiti Sains Islam Malaysia, Bandar Baru Nilai, Malaysia
6Faculty of Industrial Sciences and Technology, Universiti Malaysia Pahang, Malaysia


Article Info

Article history:

Received Apr 30, 2019
Revised Jul 30, 2019
Accepted Aug 22, 2019

Keywords:

BM25 model
Domain specific text retrieval
Expert judgment
Fuzzy logic
Ranking indicator
Z-Numbers

ABSTRACT

The main contribution of this article is to investigate three factors that may
correlate in the implementation of Expert Judgment Z-Numbers as a new
Fuzzy Logic Ranking Indicator: the expert relevance judgment or score, the
expert confidence and the level of expertise. The Expert Judgment
Z-Numbers then become an input to the Hierarchical Fuzzy Logic System of
Domain Specific Text Retrieval, along with other indicators such as the
Ontology BM25 Score, Fabrication Rate, Shia Rate and Positive Rate of a
hadith document. The results show that the proposed system, with the
additional indicator of Expert Judgment Z-Numbers, may improve on the
original BM25 ranking function: it yields better results on 26 queries across
all evaluation metrics measured in this research (P@10, %no and MAP) and
on 28 queries for P@10 alone, whereas the original BM25 score yields better
results on only 2 queries across all evaluation metrics and on 4 queries for
MAP alone. The results show that the proposed system is able to use the
experts' confidence and relevance judgments, represented as Z-Numbers, as
an indicator to optimize an existing ranking function, and that further
research in this domain has potential. For future work, the researchers would
like to enhance this research by including a wider variety of expert confidence
levels and judgments, as well as a new calculation to represent the value of
the Z-Numbers.

1. INTRODUCTION


Crowdsourcing offers an affordable and scalable means of collecting relevance judgments for
Information Retrieval (IR) test collections. However, crowd assessors may show higher variance in judgment
quality than trusted assessors, who are referred to in this study as the experts [1].
Exploring this issue has become vital, since expert judgment has the potential to improve the
existing ranking functions of IR. This is especially so in the context of Malay Translated Hadith
Information Retrieval, one example of Domain Specific Text Retrieval, where experts possess the
expertise to determine the relevance of hadith to specific topics and queries and, moreover, to assess
the authenticity of hadith documents [2]. In this study, the researchers hypothesize that crowd judges
may be reliable enough to give relevance judgments for certain documents and queries, but that this
depends on their level of confidence and their expertise. The level of crowd expertise and confidence
may affect the quality of the judgments and thus the performance of the IR system [1], but only if the
value of the judgment is taken as an indicator of the ranking function. The relevance judgment,
confidence and expertise can be represented in one single value using a Z-Number, which becomes the
input of the Fuzzy Logic Controller that optimizes the Ranking Function of Domain Specific Text
Retrieval.

In 2011, Zadeh introduced the concept of Z-numbers as a more generalized notion for describing
uncertain information. A Z-number is an ordered pair of fuzzy numbers (A, B) [3]. Here, A is the value
of some variable and B represents a measure of certainty or a closely related concept such as sureness,
confidence, reliability, strength of truth, or probability [4]. It should be noted that in everyday decision
making most decisions are in the form of Z-numbers. Various studies have examined the application of
Z-numbers in areas involving factors characterized not only by fuzziness but also by partial
reliability, such as [4-12]. These focus mainly on single- or multi-criteria decision-making
problems and their application with Z-numbers. Beyond decision-making problems, [13] proposes the
application of Z-Numbers for modeling the effect of Pilates exercise on motivation, attention,
anxiety, and educational achievement, and [14] proposes sensor data fusion with Z-numbers and its
application in fault diagnosis. Applications of expert judgments in Information Retrieval can be seen
in [15-18], which focus on pseudo relevance judgments or evidence that contribute to the final part of
information retrieval evaluation. In [19], the Pairwise Preference technique is used to collect relevance
judgments from a crowdsourcing platform in order to better understand users' perception of relevance
and to collect data with high fidelity. The work in [20] focuses on modelling randomness in relevance
judgments and evaluation measures, and more recently, in 2018, [1] showed the potential of crowd
judgments for reliably ranking IR systems for evaluation purposes, closing the gap of judgment
disagreement between experts and crowd workers.


[Figure: flowchart including the component "Document Collection (Malay Translated Hadith)"; legend: arrows denote the Basic Process]


Figure 1. Conceptual Framework of FBMIR


2. RESEARCH METHOD 
2.1. Framework 

Figure 1 illustrates all components in the framework of the Fuzzy BM25 Malay Information
Retrieval System (FBMIR) of this research. It includes the basic processes of Information Retrieval, namely
Pre-Processing, Query Operation, and the Retrieval and Ranking Process, in which the new contributions of
this research are made. In the Retrieval and Ranking Process, the researchers have
optimized the BM25 ranking function by applying a Hierarchical Fuzzy Logic System design that
includes four Fuzzy Logic Controllers (FLCs), each of which performs Fuzzification, Inference
and Defuzzification to produce its own output. The inputs of the four FLCs consist of the Ontology BM25
Score, Fabrication Rate, Shia Rate and Positive Rate of a hadith document and, lastly, the Expert
Judgment Z-Numbers as a new additional ranking indicator. The evaluation was carried out by
calculating the evaluation metrics P@10, %no and MAP, and comparing the results of
FBMIR with the BM25 original score and the Vector Model score.


2.2. The system 

The system is designed by integrating two of the researchers' previous works, namely
domain specific concept ontologies and text summarization as hierarchical fuzzy logic ranking indicators on
a Malay text corpus in [2] and the Positive Rate of Hadith in [21], with the new Expert Judgment with
Z-Numbers, which is the main contribution of this article. The details of the design are
explained as follows.


2.2.1. Data collection 

The system uses a Malay Translated Hadith corpus that consists of 2,026 Sahih
Bukhari text documents, 160 fabricated hadith text documents from the book Al-Manar al-Munif Fi al-Sahih
wa-al-Dhoif by Ibn Qayyim al-Jawziyah, 1,270 Shia hadiths from the Kithab Usul and Raudah of Kitab
Al-Khafi, and 1,728 documents from Kithab Al-Fiqh Al-Manhaji Mazhab Al-Syafie by Dr Zulkifli bin
Mohamad al-Bakri, together with the hadith documents from the experts' relevance judgments.


2.2.2. Model for modification 
a. Ranking function 

The system reuses the ranking function from the researchers' previous
work in [2], namely the Ontology Score of BM25, which acts as the first ranking-score input of the
Hierarchical Fuzzy Logic System explained in Section b. The Ontology BM25 Score of a document D
for a query Q is calculated as:

$$\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)} \qquad (1)$$

$$\text{Ontology BM25 Score} = \mathrm{score}(D, Q) + \sum \mathrm{ONT}(\mathrm{score}(D, Q)) \qquad (2)$$

where Q is the query consisting of the keywords q_1 to q_n, f(q_i, D) is the frequency of q_i in D,
|D| is the length of D, avgdl is the average document length in the collection, k_1 and b are free
parameters, and ONT(score(D, Q)) is the accumulated Ontology Score of the documents for the given query.
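
The ranking function above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard Okapi BM25 form, the IDF variant and the defaults k1 = 1.2 and b = 0.75 are common choices that the article does not specify, and the ontology scores passed to `ontology_bm25_score` are hypothetical placeholder values rather than output of the ontology component in [2].

```python
import math
from collections import Counter


def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of one tokenised document for a tokenised query."""
    n_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n_docs
    tf = Counter(doc_terms)
    score = 0.0
    for q in query_terms:
        df = sum(1 for d in corpus if q in d)
        # Assumed IDF variant (non-negative form); not taken from the article.
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
        f = tf[q]
        denom = f + k1 * (1.0 - b + b * len(doc_terms) / avgdl)
        score += idf * f * (k1 + 1.0) / denom
    return score


def ontology_bm25_score(base_score, ontology_scores):
    """Equation (2): BM25 score plus the accumulated ontology scores."""
    return base_score + sum(ontology_scores)


# Hypothetical toy corpus of tokenised documents.
corpus = [["solat", "subuh", "waktu"], ["makanan", "halal"], ["solat", "jemaah"]]
query = ["solat"]
base = bm25_score(query, corpus[0], corpus)
total = ontology_bm25_score(base, [0.5, 0.25])  # placeholder ontology scores
```

A document that contains none of the query keywords receives a base score of zero, so in this sketch any ranking signal for it would come only from the ontology term.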


b. Hierarchical fuzzy logic system 

One advantage of a Hierarchical Fuzzy Logic System is that it reduces the number of fuzzy
rules in the complete system by distributing the inputs across separate Fuzzy Logic Controllers (FLCs). This
approach also decreases computation time, supports the system's robustness and efficiency,
and has been shown to capture and manipulate expert knowledge and input [22]. In this article, the
researchers again use the Hierarchical Fuzzy Logic System design from their
previous work in [2, 21]. In [2], the researchers explained the operation of FLC1 and FLC2:
FLC1, which uses the Mamdani method, takes two inputs, the
Ontology BM25 Score and the Fabrication Rate, and FLC2 takes the output of
FLC1 and the Shia Rate. In [21], the researchers extended the design with a new FLC3,
which takes the output of FLC2 and the Positive Rate of Hadith. In this article, the
researchers add a new FLC4 that runs after FLC3. It also takes two inputs, the
output of FLC3 and the Expert Judgment Z-Numbers, and yields the Final Ranking Score of the
document, as shown in Figure 2.

Figure 2. Hierarchical Fuzzy Logic Controller of FBMIR


Fuzzy Logic can be used in any retrieval model that possesses its own rule base, and it
allows the expertise to be described in a more intuitive and more human-like manner, especially with the
Mamdani fuzzy inference technique used in this research. Every FLC mentioned in this section involves
Fuzzy Processing, which comprises Fuzzification, Inference and Defuzzification to produce the result.
Examples of the application of this method can be seen in [23-25].

Fuzzification: The process of defining the degree of membership of a crisp value in each fuzzy set and
the mathematical meaning of the linguistic variables. In this article, the researchers explain the linguistic
variables shown in Table 1 and Table 2, and the formulation and calculation of the new Expert Judgment
Z-numbers used in FLC4. The details of the linguistic variables, the formulation
and the calculation for FLC1, FLC2 and FLC3 can be seen in the researchers' previous
works in [2, 21]. FLC4 adds one new input, the Expert Judgment Z-Numbers, alongside the output of FLC3,
and produces the Final Ranking Score, which has four output values, each described by a triangular
membership function defined by three points. The four output values and the eight input values
(four for each input) were identified and evaluated by the domain
expert, along with the variables and their ranges of possible values.

Inference: The process of formulating the mapping from a given input to an output using fuzzy logic,
right after the Fuzzification process. The mapping then provides a basis from which decisions can be
made and gives the value for the variable of the fuzzy set.

Defuzzification: The result of the Fuzzy Inference is the input of the Defuzzification process.
Defuzzification is the process of obtaining a single number (output) from the
aggregated fuzzy set. In this research, the researchers use the center of area (COA)
Defuzzification method. The COA method returns the center of the area under the
curve of the aggregated fuzzy set, and this value is the final score of the ranking function.

Rules: Fuzzy rules are used within fuzzy logic systems to infer an output from the input variables, and
they are the starting point of the Fuzzy Processing. In this research, the rules were produced by
extracting the experts' knowledge and follow the Mamdani type of fuzzy rule.
Each rule of FLC4 takes two attributes, the output of FLC3
and the Expert Judgment Score, as shown in Table 3.


Table 1. Input linguistic variable


Linguistic variable             Value       Range
Output of FLC3                  Zero        [0, 0, 40]
                                Low         [20, 40, 60]
                                High        [40, 60, 80]
                                Very High   [60, 100, 100]
Expert Judgment Score (A, B)    Low         [0, 0, 40]
                                Medium      [20, 40, 60]
                                High        [40, 60, 80]
                                Very High   [60, 100, 100]


Table 2. Output linguistic variable


Linguistic variable     Value       Range
Final Ranking Score     Zero        [0, 0, 40]
                        Low         [20, 40, 60]
                        High        [40, 60, 80]
                        Very High   [60, 100, 100]

Table 3. Rules of FLC4


Case  Rule                           Final Ranking Score
1     If Expert = VH & FLC3 = L      High
2     If Expert = VH & FLC3 = M      Very High
3     If Expert = VH & FLC3 = H      Very High
4     If Expert = VH & FLC3 = VH     Very High
5     If Expert = H & FLC3 = L       Low
6     If Expert = H & FLC3 = M       High
7     If Expert = H & FLC3 = H       High
8     If Expert = H & FLC3 = VH      Very High
9     If Expert = M & FLC3 = L       Low
10    If Expert = M & FLC3 = M       Low
11    If Expert = M & FLC3 = H       Low
12    If Expert = M & FLC3 = VH      High
13    If Expert = L & FLC3 = L       Zero
14    If Expert = L & FLC3 = M       Low
15    If Expert = L & FLC3 = H       Low
16    If Expert = L & FLC3 = VH      High
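
To make the Fuzzy Processing of FLC4 concrete, the following is a minimal pure-Python sketch of Mamdani inference with center-of-area defuzzification over the triangular sets of Tables 1 and 2 and the rules of Table 3. It rests on several assumptions that are not stated in the article: both inputs are taken to be crisp values on a 0 to 100 scale, the rule labels L, M, H and VH are mapped in order onto the four sets of each input (Table 1 names the FLC3 sets Zero, Low, High and Very High), rules are combined with min for AND and max for aggregation, and the COA integral is approximated by sampling. The function names are illustrative.

```python
def tri(x, a, b, c):
    """Triangular membership with feet a and c and peak b (a == b or b == c gives a shoulder)."""
    if x < a or x > c:
        return 0.0
    if x <= b:
        return 1.0 if a == b else (x - a) / (b - a)
    return 1.0 if b == c else (c - x) / (c - b)


# Tables 1 and 2: [left foot, peak, right foot] of each triangular set.
IN_SETS = {"L": (0, 0, 40), "M": (20, 40, 60), "H": (40, 60, 80), "VH": (60, 100, 100)}
OUT_SETS = {"Zero": (0, 0, 40), "Low": (20, 40, 60),
            "High": (40, 60, 80), "Very High": (60, 100, 100)}

# Table 3: (Expert label, FLC3 label) -> Final Ranking Score label.
RULES = {
    ("VH", "L"): "High", ("VH", "M"): "Very High",
    ("VH", "H"): "Very High", ("VH", "VH"): "Very High",
    ("H", "L"): "Low", ("H", "M"): "High", ("H", "H"): "High", ("H", "VH"): "Very High",
    ("M", "L"): "Low", ("M", "M"): "Low", ("M", "H"): "Low", ("M", "VH"): "High",
    ("L", "L"): "Zero", ("L", "M"): "Low", ("L", "H"): "Low", ("L", "VH"): "High",
}


def flc4(expert, flc3, steps=1000):
    """Mamdani inference (min/max) followed by sampled center-of-area defuzzification."""
    strength = {}
    for (e, f), out in RULES.items():
        w = min(tri(expert, *IN_SETS[e]), tri(flc3, *IN_SETS[f]))
        strength[out] = max(strength.get(out, 0.0), w)
    num = den = 0.0
    for i in range(steps + 1):
        y = 100.0 * i / steps
        # Clip each output set at its rule strength, then aggregate with max.
        mu = max(min(w, tri(y, *OUT_SETS[o])) for o, w in strength.items())
        num += y * mu
        den += mu
    return num / den if den else 0.0


final_score = flc4(expert=85.0, flc3=55.0)
```

Because COA returns the centroid of the clipped output sets, the final score never reaches exactly 0 or 100 in this sketch, even when both inputs sit at the ends of their ranges.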


2.3. Proposed Calculation of Z-Numbers 

In this article, the researchers investigate three factors that may correlate in the implementation of
the Expert Judgment Z-Numbers as a Fuzzy Logic Ranking Indicator: the experts' relevance judgment or
score, the experts' confidence and the level of expertise. The first two factors are represented as a
Z-number Z = (A, B). The first component A, a restriction on the values of a real-valued
uncertain variable X, is in this case the relevance judgment score, from 0.1 to 1.0, expressed with the
linguistic terms shown in Table 4. The second component B is a measure of the reliability of the first
component. In this study it is the experts' confidence in their judgment, expressed with the scale and
approach of [5] shown in Table 5. The Z-number is then converted into a crisp number using the
defuzzification approach suggested by [9]:
Step 1: Determine the weights of the evaluation criteria. The same weight, 0.2, is used for each
criterion, with a maximum of 1.
Step 2: Construct the fuzzy decision matrix for the evaluation of the alternatives.


Table 4. Linguistic Terms and Their Corresponding Generalised Fuzzy Numbers [9]


Linguistic terms        Generalised fuzzy numbers
Absolutely-low (AL)     (0.0, 0.0, 0.0, 0.0; 1)
Very-low (VL)           (0.0, 0.0, 0.02, 0.07; 1)
Low (L)                 (0.04, 0.10, 0.18, 0.23; 1)
Fairly-low (FL)         (0.17, 0.22, 0.36, 0.42; 1)
Medium (M)              (0.32, 0.41, 0.58, 0.6; 1)
Fairly-high (FH)        (0.58, 0.63, 0.80, 0.86; 1)
High (H)                (0.72, 0.78, 0.92, 0.97; 1)
Very-high (VH)          (0.93, 0.98, 1.0, 1.0; 1)
Absolutely-high (AH)    (1.0, 1.0, 1.0, 1.0; 1)


Table 5. Reliability Linguistic Terms and Their Corresponding Z-Numbers [5]


Linguistic terms    Generalised fuzzy numbers
Very-low (VL)       (0, 0, 0, 0.25; 1)
Low (L)             (0, 0.25, 0.25, 0.5; 1)
Medium (M)          (0.25, 0.5, 0.5, 0.75; 1)
High (H)            (0.5, 0.75, 0.75, 1; 1)
Very-high (VH)      (0.75, 1, 1, 1; 1)


Step 3: Convert the Z-numbers into regular fuzzy numbers.
The fuzzy decision matrices of Z-number preferences are converted and aggregated using the main
Equation (3), together with the further equations and theorem suggested by [9]. For a generalised fuzzy
number A = (a_1, a_2, a_3, a_4; h_A), Equation (3) is computed as

$$(\tilde{x}_A, \tilde{y}_A) = \left( \frac{2(a_1 + a_4) + 7(a_2 + a_3)}{18},\ \frac{7 h_A}{18} \right) \qquad (3)$$

where
\tilde{x}_A : the centroid point on the horizontal x-axis
\tilde{y}_A : the centroid point on the vertical y-axis
(\tilde{x}_A, \tilde{y}_A) : the centroid coordinate of fuzzy number A
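
As a small numeric illustration of the centroid in Step 3, the sketch below evaluates it for two of the tabulated linguistic terms. It assumes that Equation (3) has the form x = (2(a1 + a4) + 7(a2 + a3)) / 18 and y = 7h / 18 for a generalised fuzzy number (a1, a2, a3, a4; h). The function name `centroid` is illustrative, and the sketch does not cover the further equations of [9] that combine A and B.

```python
def centroid(a1, a2, a3, a4, h=1.0):
    """Centroid (x, y) of a generalised trapezoidal fuzzy number (a1, a2, a3, a4; h).

    Assumed form of Equation (3): the core points a2 and a3 carry weight 7,
    the support points a1 and a4 carry weight 2, and the weights sum to 18.
    """
    x = (2.0 * (a1 + a4) + 7.0 * (a2 + a3)) / 18.0
    y = 7.0 * h / 18.0
    return x, y


# "High" reliability from Table 5 and "High" judgment from Table 4.
x_conf, y_conf = centroid(0.5, 0.75, 0.75, 1.0)
x_judg, y_judg = centroid(0.72, 0.78, 0.92, 0.97)
```

For a symmetric term such as the "High" reliability in Table 5, the x-coordinate coincides with the peak of the fuzzy number, which is a quick sanity check on the weights.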


Step 4: The Z-numbers are averaged and normalized to give a new Expert Judgment Score for a particular
document.

When several experts are involved, the system calculates the average of the Z-numbers for each
level of expertise (PhD, Master, Degree and User) as the final Z-numbers. To
rank a given document, the system uses the Z-number value from the highest available level of
expertise to represent the relevance judgment of that document. For example, if the
average Z-number of the PhD experts is available, the system uses this value instead of that of any
other category. If no PhD value is available, the system uses the Master value, and so on. The priority
of this selection was determined by hadith experts with regard to the Malaysian scenario. The resulting
average Z-number is then used as an input to Fuzzy Logic Controller 4 (FLC4).
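
The averaging and priority selection of Step 4 can be sketched as follows. The sketch assumes that each expert's Z-number has already been reduced to a crisp score, and it leaves out the normalization mentioned in Step 4 because the article does not give its form. The names `PRIORITY` and `expert_judgment_score` and the sample scores are hypothetical.

```python
# Levels of expertise in descending priority, as described in Step 4.
PRIORITY = ["PhD", "Master", "Degree", "User"]


def expert_judgment_score(scores_by_level):
    """Return (level, mean score) for the highest-priority level that has any scores.

    scores_by_level maps a level name to the list of crisp Z-number scores
    given by the experts of that level for one document.
    """
    for level in PRIORITY:
        values = scores_by_level.get(level) or []
        if values:
            return level, sum(values) / len(values)
    return None, None


# No PhD judgment is available here, so the Master average is used.
level, score = expert_judgment_score({"Master": [0.6, 0.8], "User": [0.2]})
```

In this scheme the lower levels only contribute when every higher level is missing, which is the behaviour the researchers propose to revisit in future work.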

Step 5: The Expert Judgment Score becomes an input to Fuzzy Logic Controller 4 (FLC4).

The Z-number then undergoes the Fuzzy Processing in FLC4 to yield the result, as described
in the previous section.


3. RESULTS AND DISCUSSION 

In this research, eight experts from four categories (PhD, Master, Degree and User),
two from each category, judged the hadith documents on eight topics,
which comprise a total of 30 queries, with an average of three queries for each topic [2]. The relevance of the
documents, the possible ranges of the Fuzzy Logic Controllers and the priority selection of Z-number values
used in this research were determined and examined by two experts, Dr Ahmad Yunus Mohd Noor from
Universiti Kebangsaan Malaysia and Dr Zulhilmi bin Mohamed Nor from Universiti Sains Islam Malaysia.
The evaluation was carried out by calculating the evaluation metrics P@10 (Precision at
Rank 10), %no (the percentage of queries with no relevant document in the top ten retrieved) and
MAP (Mean Average Precision), and by comparing the results of FBMIR with the BM25 original score
and the Vector Model score. The results are shown in Table 6 as follows.


Table 6. Results of Experiment


                    P@10                      %no                       MAP
No  Topic      FBMIR    BM25    VSM      FBMIR   BM25    VSM      FBMIR    BM25    VSM
1   Makanan    1.0      0.8     0.466    0       20      53.4     1.0      0.82    0.502
2   Adab       0.875    1       0.75     12.5    0       25       0.89     1       0.715
3   Solat      0.9      0.476   0.58     10      52.4    42       0.8765   0.486   0.584
4   Iman       1.0      0.466   0.444    0       53.4    55.6     1.0      0.463   0.565
5   Bersuci    0.95     0.937   0.513    5       6.3     48.7     0.952    0.968   0.534
6   Ibadah     0.947    0.35    0.541    5.3     65      45.9     0.9557   0.725   0.577
7   Sirah      1        0.733   0.733    0       26.7    26.7     1        0.964   0.905
8   Umum       0.8980   0.333   0.353    10.2    66.7    74.7     0.8596   0.324   0.382

FBMIR: proposed system; BM25: BM25 original score; VSM: Vector Space Model.

Based on Table 6, in terms of query topics, the proposed system, with the additional new
indicator of Expert Judgment Z-Numbers, has better results on seven topic sets, namely "Makanan", "Solat",
"Iman", "Bersuci", "Ibadah", "Sirah" and "Umum", on the P@10 and %no measures. For MAP alone,
FBMIR has better results on six topic sets, namely "Makanan", "Solat", "Iman", "Ibadah",
"Sirah" and "Umum". The BM25 original score has better results on only one
topic set, "Adab", on the P@10 and %no measures, and on two topic sets, "Adab" and "Bersuci", for MAP alone.

In terms of the overall queries, the proposed system, with the additional new indicator of Expert
Judgment Z-Numbers, may improve on the original BM25 ranking function: it yields better
results on 26 queries across all evaluation metrics measured in this research (P@10, %no
and MAP) and on 28 queries for P@10 alone, whereas the BM25
original score yields better results on only two queries across all evaluation metrics and on four queries
for MAP alone. In this research, the Vector Space Model did not outperform FBMIR or the BM25 original
score on any query. With the Ontology BM25 Score, Fabrication Rate, Shia Rate and Positive Rate applied in
FLC1, FLC2 and FLC3, as the researchers reported in [21], the proposed system had better results on five
topics and 26 queries. With the addition of the Expert Judgment with Z-Numbers in this research, the results
improve: the proposed system yields better results on seven topics and 26 queries compared to the
other models. One important finding from this experiment is that the results differ if the
Hierarchical Fuzzy Logic System is run in a different order. For example, the results get worse if the
Positive Rate is taken as an input in FLC1 and the Fabrication and Shia Rates in FLC2 and FLC3, compared to
the order that the researchers suggest in Section 2.2. Moreover, the ranking list retrieved
after FLC3 is easier for the experts to judge, because the ranking list of hadith has already been optimized
by FLC1, FLC2 and FLC3, so that most of the relevant documents are already at the top of the list.
For future work, the researchers want to apply a different logic or calculation that makes use of
all values from the different categories of expert judgment, instead of using only the value from the
highest category of expert as was done in this research.


4. CONCLUSION 

This article presented the implementation of Expert Judgment Z-Numbers as a Ranking Indicator
for a Hierarchical Fuzzy Logic System. The researchers have proposed a Hierarchical Fuzzy
Logic System design with the additional new indicator of Expert Judgment Z-Numbers, which enhances the
results of the researchers' previous works in [2, 21]. In this article, the researchers investigated three factors
that may correlate in the implementation of the Expert Judgment Z-Numbers as a Fuzzy Logic Ranking
Indicator: the expert relevance judgment, the expert confidence and the level of expertise. The first two
factors are represented as a Z-number Z = (A, B). The Expert Judgment Z-Numbers then become an input
to the Hierarchical Fuzzy Logic System of Domain Specific Text Retrieval, along with other indicators such as
the Ontology BM25 Score, Fabrication Rate, Shia Rate and Positive Rate of a hadith document. The results
show that the proposed system, with the additional new indicator of Expert Judgment Z-Numbers, may
improve on the original BM25 ranking function: it yields better results on 26 queries across all evaluation
metrics measured in this research (P@10, %no and MAP) and on 28 queries for P@10 alone,
whereas the BM25 original score yields better results on only two queries
across all evaluation metrics and on four queries for MAP alone. As a further
enhancement, the researchers want to propose and apply a new calculation and new logic that make use of
all values from the different categories of expert judgment, instead of giving priority to the value from the
highest category of expert as was done in this research. The results show that Expert Judgment
Z-Numbers as a Fuzzy Logic Ranking Indicator offer a promising method to leverage the crowd in combination
with trusted judges, their confidence and their level of expertise for accurate and affordable building of IR
test collections, and may elevate the performance of the ranking function for Domain Specific Text Retrieval.